Sum-Max Monotonic Ranked Joins for Evaluating Top-K Twig Queries on Weighted Data Graphs
Authors
Abstract
In many applications, the underlying data (the web, an XML document, or a relational database) can be seen as a graph. These graphs may be enriched with weights, associated with the nodes and edges of the graph, denoting application-specific desirability/penalty assessments, such as popularity, trust, or cost. A particular challenge when considering such weights in query processing is that results need to be ranked accordingly. Answering keyword-based queries on weighted graphs has been shown to be computationally expensive. In this paper, we first show that answering queries with further structure imposed on them remains NP-hard. We next show that, while the query evaluation task can be viewed in terms of ranked structural joins along query axes, the monotonicity property, necessary for ranked join algorithms, is violated. Consequently, traditional ranked join algorithms are not directly applicable. Thus, we establish an alternative, sum-max monotonicity property and show how to leverage this for developing a self-punctuating, horizon-based ranked join (HR-Join) operator for ranked twig-query execution on data graphs. We experimentally show the effectiveness of the proposed evaluation schemes and the HR-Join operator for merging ranked sub-results under sum-max monotonicity.
Similar resources
Twig Patterns: From XML Trees to Graphs
Existing approaches for querying XML (e.g., XPath and twig patterns) assume that the data form a tree. Often, however, XML documents have a graph structure, due to ID references. The common way of adapting known techniques to XML graphs is straightforward, but may result in a huge number of results, where only a small portion of them has valuable information. We propose two mechanisms. Filterin...
Fast Evaluation of Multi-source Star Twig Queries in a Path Materialization-based XML Database
Despite a large body of work on XML twig query processing in relational environments, the systematic study of XML join evaluation has received little attention in the literature. In this paper, we propose a novel and non-traditional technique for fast evaluation of multi-source star twig queries in a path materialization-based RDBMS. A multi-source star twig joins different XML documents on values i...
Indexing Schemes for Efficient Aggregate Computation over Structural Joins
With the increasing popularity of XML as a standard for data representation and exchange, efficient XML query processing has become a necessity. One popular approach encodes the hierarchical structure of XML data through a node numbering scheme, thus reducing typical queries to special forms (structural, path, twig) of containment joins. In this paper we consider how using an index can facilita...
A Hybrid Approach for General XML Query Processing
The state-of-the-art XML twig pattern query processing algorithms focus on matching a single twig pattern to a document. However, many practical queries are modeled by multiple twig patterns with joins to link them. The output of twig pattern matching is tuples of labels, while the joins between twig patterns are based on values. The inefficiency of integrating label-based structural joins in t...
QuickStack: A Fast Algorithm for XML Query Matching
With the increasing popularity of XML for data representation and exchange, much research has been done for providing an efficient way to evaluate twig patterns in an XML database. As a result, many holistic join algorithms have been developed, most of which are derivatives of the well-known TwigStack algorithm. However, these algorithms still apply a two phase processing scheme: first identify...